Simplify access to MCR-DL #17

Merged
merged 2 commits into from
Apr 22, 2024
Conversation

@RadhaGulhane13 RadhaGulhane13 commented Feb 26, 2024

Previously:

  1. Initializing distributed communication required application-level changes for each backend (nccl/mpi) in each distributed module (torch/mcr-dl), as shown here: https://github.com/OSU-Nowlab/MCR-DL/blob/main/benchmarks/utils.py#L61.
  2. Accessing the distributed module cluttered the application code: every access to dist required passing args around or otherwise knowing args.dist, as shown below:
if args.dist == 'torch':
    import torch.distributed as dist
elif args.dist == 'mcr_dl':
    import mcr_dl as dist

Now:

  1. Initialize dist with mcr_dl.init_processes(), which replaces torch.distributed.init_process_group() in the application.
  2. Access dist via dist = mcr_dl.get_distributed_engine() instead of import torch.distributed as dist in the application.
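
To illustrate the idea behind the two steps above, here is a minimal sketch of an "initialize once, look up anywhere" pattern. It is not the MCR-DL implementation: the two function names mirror the ones in this description, but their parameters, the `_ENGINE_MODULES` registry, and the function bodies are assumptions made for illustration only.

```python
import importlib

# Hypothetical registry mapping an engine name to an importable module path.
# The real MCR-DL internals may look entirely different.
_ENGINE_MODULES = {"torch": "torch.distributed", "mcr_dl": "mcr_dl"}

# Module-level slot holding the engine chosen at startup.
_engine = None


def init_processes(dist_engine, modules=_ENGINE_MODULES):
    """Select the distributed module once, at application startup (sketch)."""
    global _engine
    if dist_engine not in modules:
        raise ValueError(f"unknown distributed engine: {dist_engine!r}")
    # Import lazily so that only the selected engine needs to be installed.
    _engine = importlib.import_module(modules[dist_engine])
    # A real implementation would also initialize the backend (nccl/mpi) here.
    return _engine


def get_distributed_engine():
    """Return the engine chosen by init_processes(), without needing args."""
    if _engine is None:
        raise RuntimeError("call init_processes() before get_distributed_engine()")
    return _engine
```

With a pattern like this, only the startup code needs to know which engine was requested; every other module can call get_distributed_engine() instead of branching on args.dist.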

Signed-off-by: Radha Guhane <gulhane.2@buckeyemail.osu.edu>

github-actions bot commented Feb 26, 2024

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

Signed-off-by: Radha Guhane <gulhane.2@buckeyemail.osu.edu>
@RadhaGulhane13 RadhaGulhane13 force-pushed the simplify-mcr-dl-access branch from 8b7f2f6 to d76dd9c Compare April 20, 2024 01:11

@Quentin-Anthony Quentin-Anthony left a comment


LGTM

@RadhaGulhane13

I have read the CLA Document and I hereby sign the CLA

@Quentin-Anthony Quentin-Anthony merged commit 47c97a8 into main Apr 22, 2024
1 check failed
@github-actions github-actions bot locked and limited conversation to collaborators Apr 22, 2024